Optical Character Recognition through Character-set Dependent Probabilities
Author
Abstract
Although optical character recognition is a field where machine learning algorithms are easily applied, doing so may not always be cost-effective. Viola and Jones explored Haar features in their face detection algorithm, but selecting from the thousands of candidate features is quite time-consuming. This paper compares their feature set with a baseline of pixel learners and with another set of Haar features selected using knowledge of the character set. Although probability-chosen Haar features are a computational improvement over AdaBoost-selected Haar features, they do not perform as well as the raw pixel features, though there is room for improvement.
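For readers unfamiliar with Haar features, the following is a minimal sketch of how a single two-rectangle Haar-like feature can be evaluated on a binary glyph image using an integral image (summed-area table), the general technique associated with Viola and Jones. It is not the paper's implementation: the function names, the image layout (a list of rows), and the choice of a horizontal two-rectangle feature are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # assumed layout: list of rows, one int per pixel


def integral_image(img: Image) -> Image:
    """Build a summed-area table padded with a leading zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_horizontal(ii: Image, top: int, left: int,
                        height: int, width: int) -> int:
    """Hypothetical feature: left half minus right half (width must be even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy 2x4 glyph fragment: ink (1) on the left, background (0) on the right.
glyph = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
table = integral_image(glyph)
feature_value = two_rect_horizontal(table, 0, 0, 2, 4)  # 4 - 0 = 4
```

Each feature costs a constant number of lookups once the table is built, so the expense the abstract refers to lies in choosing among the many possible positions and sizes, not in evaluating any single feature.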
Similar resources
Handwritten Character Recognition using Conditional Probabilities
Handwritten Character Recognition is an important part of Pattern Recognition. This is also referred to as Intelligent Character Recognition (ICR). In this paper, a conditional probability based combination of multiple recognizers for character recognition will be introduced. After preprocessing the given character image, different feature recognition algorithms are employed, and their performa...
Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation
Most of the post-processing methods for character recognition rely on contextual information of character and word-fragment levels. However, due to linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and we need much higher-level contextual information to improve the recognition results. This paper present...
A Character Recognition Application of an Iterative Procedure for Feature Selection
A simple yet powerful technique for selecting features to be used in a pattern recognition system has been devised and applied to an eight-class character recognition problem using a set of 19 000 typed character samples as a data base. High-order joint probabilities have been directly estimated from the data base, thus making it possible to take into account in the feature selection process th...
Enhanced Good-Turing and Cat.Cal: Two New Methods for Estimating Probabilities of English Bigrams (abbreviated version)
For many pattern recognition applications including speech recognition and optical character recognition, prior models of language are used to disambiguate otherwise equally probable outputs. It is common practice to use tables of probabilities of single words, pairs of words, and triples of words (n-grams) as a prior model. Our research is directed to 'backing-off' methods, that is, methods th...
Bounding the Probability of Error for High Precision Optical Character Recognition
We consider a model for which it is important, early in processing, to estimate some variables with high precision, but perhaps at relatively low recall. If some variables can be identified with near certainty, they can be conditioned upon, allowing further inference to be done efficiently. Specifically, we consider optical character recognition (OCR) systems that can be bootstrapped by identif...